Managing Massive Time Series Streams with MultiScale Compressed Trickles

نویسندگان

  • Galen Reeves
  • Jie Liu
  • Suman Nath
  • Feng Zhao
چکیده

We present Cypress, a novel framework to archive and query massive time series streams such as those generated by sensor networks, data centers, and scientific computing. Cypress applies multi-scale analysis to decompose time series and to obtain sparse representations in various domains (e.g. frequency domain and time domain). Relying on the sparsity, the time series streams can be archived with reduced storage space. We then show that many statistical queries such as trend, histogram and correlations can be answered directly from compressed data rather than from reconstructed raw data. Our evaluation with server utilization data collected from real data centers shows significant benefit of our framework.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Cypress: Managing Massive Time Series Streams with Multi-Scale Compressed Trickles

We present Cypress, a novel framework to archive and query massive time series streams such as those generated by sensor networks, data centers, and scientific computing. Cypress applies multi-scale analysis to decompose time series and to obtain sparse representations in various domains (e.g. frequency domain and time domain). Relying on the sparsity, the time series streams can be archived wi...

متن کامل

Massive Data Streams Research: Where to Go

This phenomenon has challenged how we store, communicate and compute with data. Theories developed over past 50 years have relied on full capture, storage and communication of data. Instead, what we need for managing modern massive data streams are new methods built around working with less. The past 10 years have seen new theories emerge in computing (data stream algorithms), communication (co...

متن کامل

A Sketch-based Clustering Algorithm for Uncertain Data Streams

Due to the inaccuracy and noisy, uncertainty is inherent in time series streams, and increases the complexity of streams clustering. For the continuous arriving and massive data size, efficient data storage is a crucial task for clustering uncertain data streams. With hash-compressed structure, an extended uncertain sketch and update strategy are proposed to store uncertain data streams. And ba...

متن کامل

Closed-Loop MPEG Video Rendering

Closed-loop video rendering implies generating video streams in compressed form given as input video streams that are in compressed form such as JPEG or MPEG bit streams. Closed-loop video rendering has signi cant advantages over open-loop schemes including reduced storage requirements which in turn reduces memorydisk swapping time that constitutes a large portion of the time spent in the deskt...

متن کامل

Sequential Pattern Mining for Uncertain Data Streams using Sequential Sketch

Uncertainty is inherent in data streams, and present new challenges to data streams mining. For continuous arriving and large size of data streams, modeling sequences of uncertain time series data streams require significantly more space. Therefore, it is important to construct compressed representation for storing uncertain time series data. Based on granules, sequential sketches are created t...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • PVLDB

دوره 2  شماره 

صفحات  -

تاریخ انتشار 2009